ABSTRACT


Viral metagenomics, modeling of protein structure, and manipulation of viral genetics are key approaches that 
have laid the foundations of our understanding of coronavirus biology. In this review, we discuss the major 
advances each method has provided and discuss how future studies should leverage these strategies synergis- 
tically to answer novel questions. 


1. Introduction 


The Severe Acute Respiratory Syndrome (SARS) epidemic first 
emerged in southern China in late 2002 and rapidly spread world-wide, 
causing 8096 confirmed cases in 27 countries and resulting in 774 
deaths (World Health Organization, 2004). Defined as a novel cor- 
onavirus (CoV) (Drosten et al., 2003; Ksiazek et al., 2003; Peiris et al., 
2003a; Rota et al., 2003), SARS-CoV was generally agreed to have 
originated in bats and highlighted the risk posed by viruses emerging 
from zoonotic sources (Guan et al., 2003; Lau et al., 2005; Li et al., 
2005b; Tang et al., 2006; Tu et al., 2004). A decade later, Middle East 
Respiratory Syndrome CoV (MERS-CoV) was identified as the causative 
agent of another ongoing outbreak (Zaki et al., 2012). In the five years 
since, significant progress has been made in understanding the origins, 
biology, and emergence potential of CoVs. These studies have been 
aided by advancements in three critical research areas: viral metage- 
nomics, structural modeling studies, and reverse genetics. In this re- 
view, we detail novel insights recently defined by advances in each 
approach and discuss their impacts on our understanding of CoV in- 
fection and emergence. We also consider how these strategies can be 
integrated to better prepare for the next emergent CoV strain. 


2. Exploring an unknown frontier 


Viral metagenomics has greatly expanded the scope and under- 
standing of CoVs. Before 2002, the CoV family consisted of a relatively 
modest number of viruses infecting the airway or the fecal-oral tracts. 
With the only known human CoV strains causing mild disease, the fa- 
mily had not been considered a significant threat to human public 
health. In 2002, SARS initially presented itself as an atypical 


pneumonia for which no known causal agent could be determined 
(Peiris et al., 2003b). Eventually, the novel CoV was isolated from in- 
fected patients and sequenced, demonstrating SARS to be caused by a 
genetically distinct CoV of unknown origin (Peiris et al., 2003a; Rota 
et al., 2003). With the earliest cases of SARS occurring in food service 
workers handling exotic animals, initial studies focused on surveying 
animals in live markets for the presence of SARS-CoV progenitors using 
traditional viral discovery methods, including seropositivity studies, 
isolation in culture, and visualization of virions using electron microscopy
(Guan et al., 2003; Tu et al., 2004). Such viral discovery efforts led to 
the identification of a SARS-CoV strain in a Himalayan palm civet that 
shared 99.8% nucleotide identity with the epidemic strain (Guan et al., 
2003). Later, the observation that neither farmed nor wild civets har- 
bored SARS antibodies outside of live animal markets led to the in- 
vestigation of other zoonotic sources for SARS-CoV (Poon et al., 2005; 
Tu et al., 2004). Studies quickly identified SARS progenitors circulating 
in bats belonging to the Rhinolophus genus (commonly referred to as 
horseshoe bats). Full genome comparisons determined that these pro- 
genitor strains had similar genome organization to and a high nucleo- 
tide sequence identity (88-92%) with SARS-CoV, suggesting that the 
epidemic strain emerged from these bat CoV populations (Lau et al., 
2005; Li et al., 2005b; Ren et al., 2006; Tang et al., 2006). Together, 
these studies established the classic model of SARS-CoV emergence, 
whereby civets initially infected with SARS-CoV served as intermediate 
hosts, leading to the generation of an adapted strain capable of human 
infection. 

The discovery of the progenitor SARS-like CoV strains circulating in 
Chinese bat populations led to a global effort to identify and define the 
phylogenetic relationships of the Coronaviridae family. Over the past 
15 years, dozens of animal populations have been surveyed, and the 
Fig. 1. The expanding phylogeny of coronaviruses. The Spike protein sequences of 83 coronaviruses were aligned and phylogenetically compared. The four coronavirus genera are grouped
in shades of orange (Alphacoronavirus), blue (Betacoronavirus), red (Gammacoronavirus), and green (Deltacoronavirus). Classic subgroup designations (1a-b, 2a-d, 3, and 4) are also 
shown. Sequences designated as 1b* group with 1b viruses in proteins other than Spike. Individual viral species and strains are colored based on the original publication dates of their 
sequences: black (pre-2002), red (2002-2005), blue (2006-2011), green (2012-2014), and purple (2015-2017). Sequences were aligned using the MUSCLE package in Geneious R9. The 
tree was constructed using the neighbor-joining method based on the multiple sequence alignment, also in Geneious R9. Numbers in parentheses following virus species names indicate 
the number of sequences represented at that tree position. The radial phylogram was visualized and rendered for publication using CLC Sequence Viewer 7 and Adobe Illustrator CC 2017. 


full-length genomes of numerous novel CoVs have been identified, 
greatly expanding the CoV family tree (Drexler et al., 2014). In contrast 
to the identification of SARS-CoV, which relied on traditional discovery
methods, the expansion of the CoV family tree has largely occurred
through viral metagenomics, that is, through the direct examination
of CoV genetic material obtained from environmental samples 
(Edwards and Rohwer, 2005; Simmonds et al., 2017). Many of these 
studies have relied on PCR-based assays targeting conserved CoV sequences
such as the RdRp gene (Drexler et al., 2014);
inexpensive high-throughput deep sequencing methods have also been, and will
likely increasingly be, exploited to study CoV populations (Alagaili
et al., 2014; Anthony et al., 2013; Briese et al., 2014; Cotten et al., 
2013a, 2013b, 2014; Donaldson et al., 2010). Currently, the Cor- 
onaviridae family is divided into 4 unique clades designated the Alpha-, 
Beta-, Gamma-, and Deltacoronaviruses (referred to hereafter by their 
historical designation as Groups 1-4, respectively) (Fig. 1) and includes
viruses known to infect humans, bats, other mammals, and several 
avian species. Among these, only CoVs designated in black were iden- 
tified prior to the emergence of SARS-CoV. During the outbreak 
(2002-2005, red) and its immediate aftermath (2006-2011, blue), a
number of SARS-CoV-related and progenitor strains were identified and 
formed the core of the new group 2b branch. Similarly, identification of 
novel bat CoV sequences, including HKU4, HKU5, and HKU9-CoVs, 
populated newly formed groups. Likewise, the discovery of both HCoV- 
NL63 and HCoV-HKU1, which cause minor diseases in humans, 


expanded upon CoV groups that already existed. Together, viral me- 
tagenomic studies in the wake of the SARS-CoV epidemic provided the 
first robust look into the existing CoV phylogeny and provided a fra- 
mework for understanding the sources of CoV emergence. 

This expanded phylogeny born from metagenomic studies of SARS- 
CoVs allowed for the rapid identification of the MERS-CoV as a group 
2C CoV. Similar in sequence to HKU4 and HKU5-CoV, this novel human 
CoV was quickly distinguished from SARS-CoV; HKU4 and HKU5 were 
used as reagents to verify and characterize the new strain (Agnihothram 
et al., 2014a, 2014b). Further study led to the determination that 
HKU4, but not HKU5, could bind the MERS-CoV receptor, human dipeptidyl
peptidase 4 (DPP4) (Wang et al., 2014). Just as SARS-CoV
infection was traced to food service workers handling exotic animals, 
the observation that many MERS patients had been in contact with 
dromedary camels led to the search for MERS-CoV progenitors in camel 
populations (Azhar et al., 2014; Reusken et al., 2013). Camel herds 
throughout the Middle East were found to have MERS-CoV neutralizing 
antibodies and to harbor CoVs with nearly identical sequence to MERS- 
CoV (Azhar et al., 2014; Haagmans et al., 2014; Hemida et al., 2014; 
Reusken et al., 2013). Examination of historical serum samples sug- 
gested that MERS-CoV had been present in dromedaries since the 1980s 
and likely originated in Eastern Africa, traveling to Saudi Arabia via the 
camel trade (Müller et al., 2014). Together, the emergence of MERS-CoV
highlighted the utility of expanding the CoV phylogeny through
viral metagenomic studies. 
While a number of studies had provided evidence for SARS-CoV's
origin in bats, viral metagenomic surveys continued to add detail to the 
CoV tree. Despite having high sequence identity, potential SARS-CoV 
progenitors, such as HKU3, had < 70% amino acid identity within the 
S1 domain of the spike (S) protein (Lau et al., 2005; Li et al., 2005b; Ren 
et al., 2006; Tang et al., 2006) and could not bind either the civet or the 
human receptor, angiotensin-converting enzyme 2 (ACE2) (Lu et al., 
2015; Ren et al., 2008). However, recent surveys in East Asia further 
expanded the CoV tree and identified several new group 2B CoVs with 
higher sequence identity to SARS-CoV S1. Four such virus sequence 
clusters have been identified: SHC014, LYRa11, WIV1, and WIV16,
with 82.4%, 84.4%, 86.5%, and 95.4% amino acid identity with SARS- 
CoV S1, respectively (Ge et al., 2013; He et al., 2014; Yang et al., 2015). 
Importantly, three CoVs (SHC014, WIV1, and WIV16) have been shown 
experimentally to bind civet and human ACE2, suggesting that the 
epidemic strain SARS-CoV may have emerged from one of the quasi- 
species pools in the SARS-like CoV population (Ge et al., 2013; Yang 
et al., 2015). Similarly, more distantly related MERS-like CoVs have 
been found in bats throughout Africa, which possibly spread MERS-CoV 
progenitor virus(es) to camels in the region (Annan et al., 2013; 
Corman et al., 2014; Ithete et al., 2013). Two recent studies discovered 
the MERS-like CoVs NeoCoV and PREDICT/PDF-2180; each shares >85%
nucleotide identity with MERS-CoV (Anthony et al., 2017;
Corman et al., 2014) but have low sequence identity within MERS S1. 
Neither virus has been shown to utilize human DPP4 as a receptor, 
suggesting that they are not likely to emerge in humans. Even the ori- 
gins of other human CoVs have recently been linked to bats; several 
CoV sequences were isolated from Kenyan bats and found to be closely 
related to HCoV-NL63 and HCoV-229E (Tao et al., 2017). Together, the 
continuation of viral metagenomic surveys provides a critical resource 
to quickly place novel strains and to define the origins of emergent 
viruses. 
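
The identity percentages quoted in this section depend on how an alignment is scored. As a minimal sketch, assuming two sequences that have already been aligned (for example with MUSCLE) and assuming that columns containing a gap in either sequence are excluded from the denominator, percent identity could be computed as shown here. This gap convention is only one of several in use and is not necessarily the one applied in the cited studies; the function name and the toy sequences are illustrative and are not real spike sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns with a gap ('-') in either sequence are skipped,
    so the denominator is the number of ungapped aligned columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical): 6 matches over 9 ungapped columns.
reference = "MFVFLVLLPL"
candidate = "MFIFLLFL-L"
print(round(percent_identity(reference, candidate), 1))  # 66.7
```

Restricting the comparison to a single domain such as S1 amounts to slicing both aligned strings to the same column range before calling the function; the column bounds would come from the domain annotation of the reference sequence.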

Since the emergence of SARS-CoV in 2002, viral metagenomics has 
provided an indispensable tool for the study of CoVs. Examinations of 
CoVs currently circulating in animal populations using culture-independent
sequencing techniques have allowed the field to trace the
zoonotic origins of human CoVs to progenitor strains circulating in bats, 
providing insights into their emergence. Viral metagenomics has also 
greatly expanded our understanding of the diversity of the 
Coronaviridae family, as exemplified by the rapid classification of MERS-CoV
within the existing group 2C clade. This designation permitted
application of reagents against similar group 2C CoVs and had implications
for characterization and treatment. Importantly, the existence
of SARS-like and MERS-like CoVs circulating in zoonotic populations
indicates a continued threat for the emergence and
reemergence of CoVs. Using viral metagenomic techniques, surveillance 
can possibly identify and help predict the next emergent CoV strains. 


3. Coronavirus spike structure: examining host tropism and 
epitope discovery 


Studying protein structure has long been useful for understanding 
biological functions and developing therapeutics against emergent 
viruses, including influenza, Ebola, and Zika (Kang et al., 2017; 
Saphire, 2013; Wu and Wilson, 2017). Notably, CoV research pioneered 
these types of studies in the context of an emerging virus outbreak, with 
structural models of the SARS-CoV Spike (S) protein providing insight 
into its transition to new hosts and its neutralization (Li, 2013; Lu et al., 
2015). Recently, advances in structural biology and cryo-electron mi- 
croscopy (cryo-EM) have permitted the recovery of CoV S proteins in 
their trimeric conformation (Kirchdoerfer et al., 2016; Pallesen et al., 
2017; Walls et al., 2016a, 2016b; Yuan et al., 2017). The resulting ef- 
forts have not only improved our understanding of CoV biology but 
have also opened novel avenues for therapeutic treatments. 

Following the original SARS-CoV epidemic, structural studies of the 
S protein provided insights into viral emergence and avenues to 
therapeutic treatment. The CoV S protein is subdivided into an S1 domain,
primarily responsible for receptor binding, and an S2 domain,
which is responsible for the fusion of the viral and cellular membranes 
(Lu et al., 2015). In 2005, the crystal structure of the SARS-CoV re- 
ceptor-binding domain (RBD) bound to human ACE2 indicated the 
presence of two subdomains, a core domain and the receptor-binding 
motif (RBM). Fourteen RBM residues contact 18 residues on the
N-terminal lobe of human ACE2 to promote
binding (Li et al., 2005a). Subsequent studies demonstrated that the
primary barriers to host transition are differences in ACE2 sequences 
between species, necessitating changes in both topology and charge 
within the RBM (Li, 2008; Li et al., 2005c; Qu et al., 2005; Wu et al., 
2012). Importantly, inefficient binding to both bat and mouse ACE2 by 
the epidemic SARS-CoV strains suggested that mutations were required 
for emergence (Frieman et al., 2012; Hou et al., 2010; Li, 2013; Roberts 
et al., 2007). Aided by the solved SARS-CoV S structure, epitope map- 
ping of previously known SARS-CoV monoclonal neutralizing anti- 
bodies (mAbs) revealed that they bound primarily to the RBD, likely 
disrupting interaction with ACE2 (Cao et al., 2010; He et al., 2006, 
2005b; Lu et al., 2004). In concordance, vaccine experiments with the 
SARS-CoV RBD induced neutralizing antibodies and protected against 
viral challenge (Du et al., 2007; He et al., 2005a, 2004; Zakhartchouk 
et al., 2007). In addition, mAbs targeting the RBD were also developed 
for prophylaxis; however, these mAbs were susceptible to escape mu- 
tations and lacked neutralization capacity against zoonotic SARS-CoV 
strains, thus limiting their value (Rockx et al., 2008, 2010; Sui et al., 
2014). Given the identification of additional SARS-like CoVs
circulating in bat populations (Lau et al., 2005; Li et al., 2005b; Ren
et al., 2008; Tang et al., 2006), the initial therapeutics developed to
target the SARS-CoV RBD are unlikely to protect against newly emergent
infections. However, these structural studies have provided important
insights that have informed both studies of CoV emergence and
therapeutic treatments against future emergent strains.

As was the case for viral metagenomics, structural studies ex- 
amining SARS-CoV S provided a blueprint for investigations into the 
MERS-CoV S. The MERS-CoV RBD was rapidly identified, and its crystal
structure was solved (Chen et al., 2013; Lu et al., 2013; Wang et al., 
2013). The MERS-CoV RBD can also be divided into two subdomains: 
an RBM that binds the MERS-CoV receptor, DPP4, and a core domain 
with remarkable structural similarity to the SARS-CoV core domain. 
Eighteen amino acids within the RBM interact with 13 residues on 
DPP4 to promote binding (Lu et al., 2013; Wang et al., 2013). While 
differences in host sequences are again a barrier, DPP4 is relatively 
conserved among mammals (Barlan et al., 2014; Falzarano et al., 2014; 
Müller et al., 2012; Raj et al., 2014; van Doremalen et al., 2014).
Common small animal models are a notable exception, with mice, rats, 
hamsters, and ferrets all encoding DPP4 proteins that cannot support 
infection, constituting a significant barrier to MERS-CoV research 
(Barlan et al., 2014; Coleman et al., 2014; de Wit et al., 2013; van 
Doremalen et al., 2014). Like SARS-CoV, the MERS-CoV RBD is also a 
strong immunogen; several vaccine studies have demonstrated its po- 
tential to induce neutralizing antibodies (Du et al., 2013a, 2013b; Mou 
et al., 2013). Together, these observations illustrate how structural 
prediction can be utilized for the development of vaccines and neu- 
tralizing antibodies. 

Recently, advances in structural studies have produced a wealth of 
new CoV S structures (Kirchdoerfer et al., 2016; Pallesen et al., 2017; 
Walls et al., 2016a, 2016b; Yuan et al., 2017). While previous efforts 
had been made to explore CoV S proteins, these studies had elucidated 
only portions of S, including the post-fusion core and RBDs bound to 
receptors. While cryo-EM studies of SARS-CoV virions provided insights 
into the S glycoprotein, the lack of high-resolution CoV S trimer 
structures had limited progress in understanding entry and infection 
(Beniac et al., 2007; Neuman et al., 2006). Several groups recently 
overcame this barrier by fusing trimerization motifs into the CoV S 
(Kirchdoerfer et al., 2016; Pallesen et al., 2017; Walls et al., 2016a, 
Fig. 2. Coronavirus trimer, a structure providing new insights. A) Overall structure of the
SARS-CoV Spike ectodomain trimer as previously described (Yuan et al., 2017) was 
subdivided into the N-terminal domain (NTD, orange), receptor-binding domain (RBD, 
red), C-terminal to S1 cleavage (CTS1, magenta), and entire S2 domain (gray). B & C) Two 
previously predicted conformation states of the SARS RBD regions with the B) “lying”
conformation and C) “standing” conformation. D) Spike protein sequences of the in- 
dicated viruses were aligned according to the bounds of the NTD, RBD, CTS1, S1, and S2. 
Sequence identities were extracted from the alignments, and a heatmap of sequence 
identity using SARS-CoV-Urbani as the reference sequence was constructed using Evol- 
View (www.evolgenius.info/evolview). The heatmap was further rendered and edited in 
Adobe Illustrator CC 2017. 
2016b; Yuan et al., 2017); the resulting chimeric proteins had increased
stability, permitting the first characterizations of the CoV S in trimeric 
form. These initial studies provided novel insights into S proteolytic 
cleavage, viral fusion/entry, and conservation with other viral entry 
proteins. Subsequently, these studies spurred a wealth of new data and 
analysis. 

Solving of the CoV S trimer provides a model upon which to base 
new hypotheses and analyses. The initial SARS-CoV S trimer structure 
indicated that the position of the SARS-CoV RBD within the S trimer 
may be dynamic, with the RBD shuffling between an exposed
“standing” state and a “lying” conformation, where the RBD is buried 
between N-terminal domains within the trimer (Yuan et al., 2017) 
(Fig. 2). With receptor binding predicted to occur only in the “standing” 
position, neutralizing antibodies that bind the N-terminal domain 
(NTD) and prevent conformational switching may prove uniquely ef- 
fective; however, a lack of conservation across the S1 NTD of group 2B 
viruses suggests this domain may be a poor target (Fig. 2D). In contrast, 
the structure of the S trimer suggested that the SARS-CoV fusion peptide 
(FP) and heptad repeat 1 (HR1) regions of S2 are exposed at the surface 
of the S trimer, indicating the opportunity to identify neutralization 
epitopes on a highly conserved region of the S protein (Yuan et al., 
2017). The MERS-CoV S trimer was also solved recently by two separate 
groups. Like SARS-CoV, the MERS-CoV RBD also alternates between 
standing and lying states and has exposed residues in the FP and HR1 
domains in S2 (Pallesen et al., 2017; Yuan et al., 2017). Importantly, a 
neutralizing antibody targeting the S2 region was thoroughly char- 
acterized and has potential as a prophylaxis agent (Pallesen et al., 
2017). In addition, studies examining the HCoV-NL63 S trimer suggest 
that CoVs may utilize glycosylation to prevent recognition of 
immunogenic epitopes by the host (Walls et al., 2016b). With similar
glycosylation sites found on the SARS and MERS-CoV S trimers, these 
structural studies highlight areas on the S protein where antibody 
binding may be inhibited through glycan shielding (Yuan et al., 2017). 
Together, these structural data provide critical insights for the devel- 
opment of therapeutics against both current and potentially emergent 
CoVs. 
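
As a rough illustration of how candidate glycan-shielding positions might be flagged from sequence alone, the sketch below scans a protein sequence for the canonical N-linked glycosylation sequon, N-X-S/T with X being any residue except proline. This is a generic motif search and not the method used in the cited structural studies, which observed glycans directly; a sequon indicates only a potential site, and the function name and toy sequence are hypothetical.

```python
import re
from typing import List

# Canonical N-linked sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons (e.g. "NNST") both be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> List[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Toy sequence (hypothetical, not a real spike fragment).
print(find_sequons("MNGTNPSANNSTQ"))  # [2, 9, 10]
```

Mapping the reported positions onto a solved trimer structure would then show whether a potential site lies near a surface-exposed epitope.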

Overall, structural studies examining CoV S glycoproteins have 
provided invaluable insights into both protein function and therapeutic 
design. The structures of the SARS- and MERS-CoV RBDs defined key 
correlates impacting host tropism and immunogenic features for vac- 
cine and therapeutic studies. Moving forward, the solution of the tri- 
meric forms of CoV S proteins creates an opportunity to better under- 
stand and model viral entry and fusion. Importantly, observations of S 
topology and glycosylation can be exploited in the development of 
universal therapeutics and vaccines. Together, the application of find- 
ings from structural studies has the potential to help identify, mitigate, 
or potentially prevent the next CoV outbreak. 


3.1. Deriving insights by manipulating viral genetics 


The use of reverse genetic systems (RGSs) to produce infectious 
particles and manipulate the genetic composition of the virus is an in- 
dispensable tool in virology (Perez, 2017). For CoVs, several systems 
have been developed (Almazan et al., 2014) and used to assess CoV 
protein function, design therapeutics, and evaluate the emergence po- 
tential of novel CoVs. Together, these RGS platforms have been key in 
characterizing and understanding CoV infection in the context of recent
outbreaks.

For many years, several barriers, including the large size of the CoV genome
(~30 kb) and the existence of toxic elements within the genome that promote
genetic instability, limited the development of CoV RGSs. While
several other robust systems exist (Almazan et al., 2014), our labora- 
tories have primarily utilized a sub-cloning strategy originally devel- 
oped for transmissible gastroenteritis virus (TGEV) and subsequently 
deployed for other CoVs, including SARS-CoV and MERS-CoV (Beall 
et al., 2016; Becker et al., 2008; Scobey et al., 2013; Yount et al., 2000, 
2003, 2002). Briefly, the full-length CoV genome is divided into cDNAs 
and cloned into separate plasmids with class IIG or IIS restriction sites 
added to each end. Fragments are then directionally assembled into a 
full-length cDNA CoV genome by in vitro ligation. The CoV genome is 
subsequently transcribed, and full-length RNA is electroporated into 
cells to produce viable viruses (Almazan et al., 2014). Importantly, 
fragments can be strategically divided within toxic and unstable ele- 
ments, breaking up these sequences to achieve stable propagation of the 
sub-clones. The use of smaller plasmids for propagation and targeted 
mutagenesis also limits the accumulation of undesired mutations during 
bacterial expansion and maintains fidelity with the source CoV se- 
quence. Together, this and other RGSs provide critical tools needed to 
understand CoV infection and pathogenesis. 
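
The directional assembly step described above relies on each fragment junction having a distinct cohesive end. As a minimal sketch of that logic, assuming each cloned fragment is summarized only by the overhangs left at its two ends after digestion, the following checks that every junction overhang is unique and non-palindromic and then derives the single permitted fragment order. The fragment names, the 4-nt overhangs, and the function names are hypothetical and do not reproduce the junctions of any published clone.

```python
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def assembly_order(fragments: Dict[str, Tuple[str, str]], start: str) -> List[str]:
    """Order fragments by matching each right overhang to the next left overhang.

    `fragments` maps a fragment name to (left overhang, right overhang),
    both written 5'->3' on the top strand.
    """
    left_owner: Dict[str, str] = {}
    for name, (left, _right) in fragments.items():
        if left in left_owner:
            raise ValueError(f"left overhang {left} is shared by {left_owner[left]} and {name}")
        left_owner[left] = name
    rights = [right for _left, right in fragments.values()]
    if len(set(rights)) != len(rights):
        raise ValueError("two fragments share the same right overhang")
    for overhang in set(left_owner) | set(rights):
        # A self-complementary end could ligate to a copy of itself.
        if overhang == reverse_complement(overhang):
            raise ValueError(f"overhang {overhang} is palindromic")

    order = [start]
    while True:
        next_name = left_owner.get(fragments[order[-1]][1])
        if next_name is None:
            break  # reached the terminal fragment
        if next_name in order:
            raise ValueError("overhangs form a cycle")
        order.append(next_name)
    if len(order) != len(fragments):
        raise ValueError("not every fragment can be joined from the start fragment")
    return order


# Hypothetical three-fragment design.
design = {
    "C": ("CTTA", "TGGA"),
    "A": ("ATCG", "GACT"),
    "B": ("GACT", "CTTA"),
}
print(assembly_order(design, start="A"))  # ['A', 'B', 'C']
```

Rejecting palindromic overhangs reflects the general cloning principle that self-complementary ends allow unwanted self-ligation; whether a given published system enforces exactly these checks is not stated in the text.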

With the development of RGSs, mutations could be easily made in 
the context of emergent CoVs utilizing traditional cloning methods. For 
instance, reporter strains for both SARS-CoV and MERS-CoV were 
quickly generated by replacing “accessory” ORFs with reporter genes, 
including GFP, RFP, and luciferase (Scobey et al., 2013; Sims et al., 
2005; Yount et al., 2006). Similarly, RGSs have been key in the creation 
of mouse-adapted CoVs. Following in vivo passage, reverse genetics was 
used to reintroduce adaptation mutations into the SARS-CoV and 
MERS-CoV clones. These studies preserved a more uniform virus po- 
pulation and permitted the evaluation of the roles of individual muta- 
tions in mouse adaptation (Cockrell et al., 2016; Day et al., 2009; 
Frieman et al., 2012; Roberts et al., 2007). Reverse genetics has also 
been useful for the analysis of viral protein function, identifying key 
roles in viral antagonism of interferon (IFN) responses, inflammation, 
and host processes (Snijder et al., 2016). For example, ablation of 
nonstructural protein 16 (nsp16) function by replacing key residues at 
its active sites sensitized murine hepatitis virus (MHV) and SARS-CoV
to the Type I IFN response, attenuated viral replication in vivo, and 
identified nsp16 mutants as a potential live-attenuated vaccine platform
(Menachery et al., 2017, 2014; Züst et al., 2011). Similarly, RGSs
have been used to develop and characterize two other live-attenuated 
vaccine strategies via the deletion of SARS-CoV E or the inactivation of 
the exonuclease (ExoN) activity encoded within nsp14 (DeDiego et al., 
2007; Fett et al., 2013; Graham et al., 2012; Netland et al., 2010). 
Together, RGSs have been key in characterizing CoV infection and 
defining viral protein function in the context of infection. 

RGSs also have utility in analyzing the emergence and pathogenic 
potential of zoonotic CoVs. Because of the plasticity afforded by the 
sub-cloning system, sequences derived from zoonotic CoVs could be 
evaluated in the context of viable CoV genomes. For example, human- 
civet SARS-CoV chimeras were used to demonstrate that the S gene of 
civet strains cannot efficiently mediate viral replication in cells ex- 
pressing human ACE2, even within a human strain's backbone, sug- 
gesting that S mutations were critical for human emergence of SARS- 
CoV (Sheahan et al., 2008). Additionally, a significant barrier to the 
study of zoonotic CoVs, including HKU3-CoV and HKU5-CoV, was the 
difficulty in finding a viable culture system either due to receptor 
compatibility or possible issues with the overall viral genome. Using 
reverse genetics, substitution of minimal S portions was used to over- 
come receptor-binding issues, with de novo synthesized HKU3 and 
HKU5 genomes being made viable by substituting in portions of the
SARS-CoV S genes (Fig. 3A). When assembled, these chimeric viruses 
could infect and replicate efficiently in vitro and in vivo, demonstrating 
that receptor binding was a primary barrier for HKU3 and HKU5 in- 
fection of human cells (Agnihothram et al., 2014b; Becker et al., 2008). 
While insertion of the SARS-CoV RBD was sufficient to confer replication
competency to HKU3, the entire SARS-CoV ectodomain was required 
for HKU5, suggesting that multiple domains of the S protein may work 
in concert across a CoV group (Agnihothram et al., 2014b). A com- 
plementary strategy was used to test the capacity of CoV S genes to 
mediate infection independent of their backbones (Fig. 3B). Recently 
utilized with SHC014-CoV and WIV1-CoV, the S genes of zoonotic CoVs 
were inserted into a replication-competent backbone, the mouse-adapted
SARS-CoV MA15. The SHC014-MA15 and WIV1-MA15 chimeras
could replicate in vitro and in vivo, suggesting that these viruses
are poised for human emergence (Menachery et al., 2015, 2016). These 
Fig. 3. Dual approaches to leverage reverse genetics. Utilizing coronavirus molecular clones,
two strategies have been employed to explore the emergence and pathogenic potential of 
sequences derived from zoonotic populations. A) Replacing the wild-type spike proteins, 
this strategy explores the capacity of the spike proteins within the context of a viral 
backbone known to be capable of replication. These studies provide insights into the 
potential of spike proteins to mediate infection of human cells and cause in vivo disease 
and aid in examinations of the broad efficacy of therapeutics directed against CoV spike 
proteins. B) Utilizing portions or whole spike proteins of replication-competent CoVs, this 
approach examines the capacity of the viral backbone in mediating infection and pa- 
thogenesis. These studies provide insights into whether the backbone has the capacity to 
infect and cause disease if paired with receptor binding/entry. This approach can also 
evaluate the efficacy of therapeutics targeting portions of the CoV genome other than 
spike. Both approaches have been used to examine bat viruses currently circulating in 
animal populations around the world. 
initial studies justified further examinations and characterizations of
full-length SHC014-CoV and WIV1-CoV and indicated that mutations in
the viral backbones are also required for emergence and pathogenesis. 
Together, these two strategies leverage reverse genetics to create chimeric
coronaviruses, bypassing the limitations of species-specific culture
systems to analyze the emergence potential of zoonotic CoVs.

As demonstrated above, RGSs are an indispensable tool for the 
characterization of both human and zoonotic CoVs. Future studies will 
continue to exploit the utility of reverse genetics to investigate viral 
protein function and to identify CoV therapeutic candidates, including 
live-attenuated vaccine candidates based on viral protein inactivation 
or deletion. Additionally, the practice of creating chimeric viruses 
consisting of viable portions from established CoVs in conjunction with 
zoonotic sequences can greatly enhance the utility of metagenomic 
studies. Construction of these chimeras could also prove useful in 
vaccine and therapeutic development, as a candidate's efficacy against 
zoonotic strains may predict its utility against future emergent viruses. 
However, reverse genetic studies involving the creation of these chimeras
raise biosafety concerns, particularly in light of the recent pause
on gain-of-function studies associated with influenza and coronaviruses.
While risks of research of this nature should not be taken lightly (Weiss 
et al., 2015), it is important to take into consideration that the ex- 
periments described above have provided invaluable information re- 
garding zoonosis (Racaniello, 2016). Manipulation of viral pathogens 
using reverse genetics systems is a proven strategy for the characterization
of pathogens and the development of therapeutics, and the debate
about its future role in biomedical research should be conducted in
an evidence-based fashion (Casadevall and Imperiale, 2014). Future
studies need to be designed with oversight and discussion from the 
scientific community and should seek to strike a balance between the 
utility of the information gained with the potential risks involved. 


4. Concluding remarks 


Since the emergence of SARS-CoV, metagenomic, structural, and
reverse genetic studies have been critical research approaches in the
study of CoVs. Investigators have utilized viral metagenomics to define
the evolutionary histories of many human CoV strains and to identify
numerous zoonotic CoVs circulating in animal populations.
Researchers have built structural models of the S
protein that have provided molecular explanations for host tropism and 
have identified epitope candidates for therapeutic development. 
Creation of reverse genetics systems by members of the field has been 
critical for the manipulation of viral sequences, for enhancing our un- 
derstanding of CoV protein function and host adaptation, and for de- 
veloping live-attenuated vaccine platforms. Together, studies utilizing 
these strategies have informed our current understanding of CoV 
emergence, pathogenesis, and treatment. 

While these strategies have been used individually, future work can 
take advantage of their complementary nature. For instance, recent 
findings indicate that MERS-CoV exploits α2,6-linked sialic
acids as a secondary receptor through the N-terminal domain
(NTD) of S1, which is structurally conserved (Li et al., 2017) but di- 
vergent in sequence among CoVs (Fig. 2D) (Li et al., 2017; Walls et al., 
2016a, 2016b; Yuan et al., 2017). Reverse genetics can be used to create 
mutants and chimeras to determine the effect sialic acid binding has on 
host tropism and to determine if this function of the NTD is conserved 
across similar zoonotic CoV strains. Similarly, creating chimeric
viruses through reverse genetics has already proven vital in the study of 
zoonotic viruses (Agnihothram et al., 2014b; Becker et al., 2008; 
Menachery et al., 2015, 2016; Sheahan et al., 2008); efforts to explore 
changes in the structure of proteins from zoonotic strains relative to the 
established strains may reveal structural requirements necessary for 
emergence. Reverse genetic and structural studies, by identifying con- 
served features associated with emergence and pathogenesis, can help 
identify which CoVs identified in animal populations are likely to pose a 
public health risk and merit further study. Additionally, insights from
these strategies can be applied to existing experimental systems. For 
example, variation in host response to infection is currently a major 
area of research (Aylor et al., 2011; Menachery and Baric, 2013) and 
models of host genetic diversity, such as the Collaborative Cross (CC) mouse
panel, have identified genetic loci that modulate SARS-CoV disease
outcome (Gralinski et al., 2015, 2017; Xiong et al., 2014). Applying
insights from viral metagenomics, structure, and reverse genetics offers
the opportunity to utilize the CC to identify host genes that contribute
to emergence of zoonotic CoVs. Similarly, host-pathogen interactions 
may influence and modulate both viral sequence and structure; reverse 
genetic systems can be utilized to confirm these hypotheses. Together, 
the synergistic use of metagenomics, structural biology, and reverse 
genetic systems has significant potential to identify the molecular determinants
of CoV infection and pathogenesis. Integration of these three
strategies can help characterize pre-emergent CoV populations, allowing
the field to make predictions about which zoonotic CoVs are
likely to emerge, prepare for future outbreaks, and facilitate the
development of therapeutic strategies against CoV infection.